Tree-Based Classifier Ensembles for PE Malware Analysis: A Performance Revisit
Abstract
Given their escalating number and variety, combating malware is becoming increasingly strenuous. Machine learning techniques are often used in the literature to automatically discover the models and patterns behind such challenges and to create solutions that can maintain the rapid pace at which malware evolves. This article compares various tree-based ensemble learning methods that have been proposed for the analysis of PE malware. A tree-based ensemble is an unconventional learning paradigm that constructs and combines a collection of base learners (e.g., decision trees), as opposed to the conventional learning paradigm, which aims to construct individual learners from training data. Several tree-based ensemble techniques, such as random forest, XGBoost, CatBoost, GBM, and LightGBM, are taken into consideration and appraised using different performance measures, such as accuracy, MCC, precision, recall, AUC, and F1. In addition, the experiment includes many public datasets, such as BODMAS, Kaggle, and CIC-MalMem-2022, to demonstrate the generalizability of the classifiers in a variety of contexts. Based on the test findings, all tree-based ensembles performed well, and the performance differences between algorithms are not statistically significant, particularly when their respective hyperparameters are appropriately configured. The tree-based ensemble techniques also outperformed other, similar PE malware detectors published in recent years.
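As a rough illustration of two ideas in the abstract, the sketch below combines the outputs of several base learners and then scores the result with some of the listed measures (accuracy, precision, recall, F1, MCC). It is not the paper's pipeline: the helper names `majority_vote` and `binary_metrics`, the toy labels, and the three hypothetical base-learner outputs are all assumptions made for this example, and plain majority voting is only one simple way to combine learners. AUC is left out because it needs ranked scores rather than hard labels.

```python
import math


def majority_vote(predictions):
    """Combine 0/1 predictions from several base learners by majority vote.

    `predictions` is a list of equal-length lists, one per base learner.
    Ties are resolved in favour of class 0 (an arbitrary choice for this sketch).
    """
    n_learners = len(predictions)
    return [1 if 2 * sum(votes) > n_learners else 0 for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and MCC for 0/1 labels (1 = malware)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn

    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Toy ground truth and three hypothetical base-learner outputs (made-up data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
learner_a = [1, 1, 1, 0, 0, 0, 1, 0]
learner_b = [1, 1, 0, 1, 0, 0, 0, 1]
learner_c = [1, 0, 1, 1, 0, 1, 0, 0]

ensemble = majority_vote([learner_a, learner_b, learner_c])

for name, pred in [("learner_a", learner_a), ("ensemble", ensemble)]:
    print(name, binary_metrics(y_true, pred))
```

In this toy data each base learner makes two mistakes on different samples, so the vote cancels them out. Real ensembles such as random forest or gradient boosting build and weight their trees in more elaborate ways.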
Similar resources
Using Multi-Feature and Classifier Ensembles to Improve Malware Detection
With the rapid growth of internet applications, malware has become one of the major threats to information security. Traditionally, anti-virus products use signature matching to detect malware, but the drawback is that they cannot detect new and unknown malware. Recent studies showed that the use of machine learning can successfully detect new and unknown malware, but the limitation of this tec...
PE-Header-Based Malware Study and Detection
In this paper, I present a simpler and faster approach to distinguish between malware and legitimate .exe files by simply looking at properties of the MS Windows Portable Executable (PE) headers. We extract distinguishing features from the PE headers using the structural information standardized by the Microsoft Windows operating system for executables. I use the following three methodologies: (1)...
A High-Performance Model based on Ensembles for Twitter Sentiment Classification
Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. It allows users to publish tweets about what they think on various topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...
Consensus-based combining method for classifier ensembles
In this paper, a new method for combining an ensemble of classifiers, called Consensus-based Combining Method (CCM) is proposed and evaluated. As in most other combination methods, the outputs of multiple classifiers are weighted and summed together into a single final classification decision. However, unlike the other methods, CCM adjusts the weights iteratively after comparing all of the clas...
Designing Classifier Ensembles with Constrained Performance Requirements
Classification requirements for real-world classification problems are often constrained by a given true positive or false positive rate to ensure that the classification error for the most important class is within a desired limit. For a sufficiently high true positive rate, this may result in the set-point being located somewhere in the flat portion of the ROC curve where the associated false...
Journal
Journal title: Algorithms
Year: 2022
ISSN: 1999-4893
DOI: https://doi.org/10.3390/a15090332